5 research outputs found

    Multi-objective optimization via equivariant deep hypervolume approximation

    Optimizing multiple competing objectives is a common problem across science and industry. The inherent trade-offs between these objectives lead to the task of exploring their Pareto front. A key quantity for this purpose is the hypervolume indicator, which is used in Bayesian Optimization (BO) and Evolutionary Algorithms (EAs). However, the computational cost of calculating the hypervolume scales unfavorably with an increasing number of objectives and data points, which restricts its use in these common multi-objective optimization frameworks. To overcome these restrictions, we propose to approximate the hypervolume function with a deep neural network, which we call DeepHV. For better sample efficiency and generalization, we exploit the fact that the hypervolume is scale-equivariant in each of the objectives as well as permutation-invariant w.r.t. both the objectives and the samples, by using a deep neural network that is equivariant w.r.t. the combined group of scalings and permutations. We evaluate our method against exact and approximate hypervolume methods in terms of accuracy, computation time, and generalization. We also apply and compare our methods to state-of-the-art multi-objective BO methods and EAs on a range of synthetic benchmark test cases. The results show that our methods are promising for such multi-objective optimization tasks.
    Comment: Updated with camera-ready version. Accepted at ICLR 202
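To make the indicator concrete: for two objectives the hypervolume can still be computed exactly with a simple sweep. The sketch below (minimization convention; function name and reference point are ours, not from the paper) shows the quantity that DeepHV learns to approximate when exact computation becomes too expensive in higher dimensions.

```python
def hypervolume_2d(points, ref):
    """Exact hypervolume (minimization) of the region dominated by
    `points` and bounded by the reference point `ref`, two objectives."""
    hv = 0.0
    prev_y = ref[1]
    # Sweep points sorted by the first objective; dominated points
    # (y >= prev_y) contribute nothing and are skipped.
    for x, y in sorted(points):
        if x >= ref[0] or y >= prev_y:
            continue
        hv += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return hv
```

Note that scaling one objective by a factor s scales the result by s; this scale-equivariance (together with permutation invariance) is exactly the symmetry the paper builds into the network architecture.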

    Predicting RP-LC retention indices of structurally unknown chemicals from mass spectrometry data

    Non-target analysis combined with high-resolution mass spectrometry is considered one of the most comprehensive strategies for the detection and identification of known and unknown chemicals in complex samples. However, many compounds remain unidentified due to data complexity and the limited number of structures in chemical databases. In this work, we have developed and validated a novel machine learning algorithm to predict retention index (RI) values for structurally (un)known chemicals based on their measured fragmentation pattern. The developed model, for the first time, enabled the prediction of RI values without the need for the exact structure of the chemicals, with an R^2 of 0.91 and 0.77 and a root mean squared error (RMSE) of 47 and 67 RI units for the NORMAN (n=3131) and amide (n=604) test sets, respectively. This fragment-based model showed accuracy in RI prediction comparable to conventional descriptor-based models that rely on a known chemical structure, which obtained an R^2 of 0.85 with an RMSE of 67 RI units
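The figures of merit reported above are R^2 and RMSE. For reference, a minimal self-contained implementation of both metrics (the function name is ours):

```python
import math

def r2_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for a set of predicted retention indices."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / n)
```

An RMSE of 47 RI units thus means the model's predictions deviate from the measured indices by about 47 units on average (in the root-mean-square sense) on the NORMAN test set.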

    Chemometric Strategies for Fully Automated Interpretive Method Development in Liquid Chromatography

    The great potential gains in separation power and analysis time that can result from rigorously optimizing LC-MS and 2D-LC-MS methods for routine measurements have prompted many scientists to develop computer-aided method-development tools. Their applicability has been proven in numerous applications, but their proliferation is still limited. Arguably, the majority of LC methods are still developed in a conventional manner, i.e. by analysts who rely on their knowledge and experience. In this work, a novel, open-source algorithm was developed for automated and interpretive method development of LC separations. A closed-loop workflow was constructed that interacted directly with the LC system and ran unsupervised in an automated fashion. The algorithm was tested using two newly designed strategies. The first utilized retention modeling, whereas the second used a Bayesian-optimization machine-learning approach. In both cases, the algorithm could arrive within ten iterations at an optimum of the objective function, which included resolution and measurement time. The algorithm was deliberately designed to be modular, to facilitate compatibility with previous work in the literature; its performance thus hinged on each module (e.g., signal processing, choice of retention model, objective function). Key focus areas for further improvement were identified. Bayesian optimization did not require any peak tracking or retention modeling, whereas accurate prediction of elution profiles was found to be indispensable for the retention-modeling strategy. This is the first interpretive algorithm demonstrated with complex samples. Peak tracking was conducted using UV-Vis absorbance detection, but the use of MS detection is expected to significantly broaden the applicability of the workflow
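The "choice of retention model" refers to an analytical relation between a compound's retention factor and the mobile-phase composition. A common example in reversed-phase LC (not necessarily the one used in this work) is the linear solvent strength (LSS) model, log10 k = log10 kw - S*phi, which can be fitted from as few as two scouting runs; parameter values below are illustrative.

```python
import math

def lss_k(phi, log_kw, S):
    """LSS model: retention factor k at organic-modifier fraction phi (0..1)."""
    return 10 ** (log_kw - S * phi)

def fit_lss(phi1, k1, phi2, k2):
    """Fit (log10 kw, S) from two isocratic scouting measurements."""
    S = (math.log10(k1) - math.log10(k2)) / (phi2 - phi1)
    log_kw = math.log10(k1) + S * phi1
    return log_kw, S
```

Once fitted per analyte, such a model lets the interpretive strategy simulate candidate gradients in silico instead of running each one on the instrument.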

    Chemometric Strategies for Fully Automated Interpretive Method Development in Liquid Chromatography

    The majority of liquid chromatography (LC) methods are still developed in a conventional manner, that is, by analysts who rely on their knowledge and experience to make method development decisions. In this work, a novel, open-source algorithm was developed for automated and interpretive method development of LC(-mass spectrometry) separations ("AutoLC"). A closed-loop workflow was constructed that interacted directly with the LC system and ran unsupervised in an automated fashion. To achieve this, several challenges related to peak tracking, retention modeling, the automated design of candidate gradient profiles, and the simulation of chromatograms were investigated. The algorithm was tested using two newly designed method development strategies. The first utilized retention modeling, whereas the second used a Bayesian-optimization machine learning approach. In both cases, the algorithm could arrive within 4-10 iterations (i.e., sets of method parameters) at an optimum of the objective function, which included resolution and analysis time as measures of performance. Retention modeling was found to be more efficient while depending on peak tracking, whereas Bayesian optimization was more flexible but limited in scalability. We have deliberately designed the algorithm to be modular to facilitate compatibility with previous and future work (e.g., previously published data handling algorithms)
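The closed-loop structure described above (propose method parameters, run the separation, score the result, update) can be sketched as follows. The parameter names, the simulated instrument response, and the random-search proposer are illustrative stand-ins: the real workflow talks to an LC system and uses retention modeling or Bayesian optimization to propose the next candidate.

```python
import random

def run_lc(params):
    """Stand-in for the instrument call. In the real workflow this triggers
    an LC run and returns measured (critical resolution, analysis time)."""
    slope, hold = params
    resolution = 2.0 - abs(slope - 0.5) * 2          # toy response surface
    time_min = 5.0 + hold + 10.0 * (1.0 - slope)
    return resolution, time_min

def objective(resolution, time_min, t_max=30.0):
    """Scalarized objective: reward resolution, penalize analysis time."""
    return min(resolution, 2.0) - 0.05 * min(time_min, t_max)

def closed_loop(n_iter=10, seed=0):
    """Unsupervised optimization loop: propose -> run -> score -> keep best."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(n_iter):
        # Candidate gradient profile: (gradient slope, initial hold time).
        params = (rng.uniform(0.1, 1.0), rng.uniform(0.0, 10.0))
        res, t = run_lc(params)
        score = objective(res, t)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params
```

Swapping the random proposer for a Bayesian-optimization acquisition step, or for candidate gradients derived from fitted retention models, yields the two strategies compared in the paper; the modular loop itself is unchanged.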